2,310 research outputs found
Diversity of O Antigens within the Genus Cronobacter: from Disorder to Order
Cronobacter species are Gram-negative opportunistic pathogens that can cause serious infections in neonates. The lipopolysaccharides (LPSs) that form part of the outer membrane of such bacteria are possibly related to the virulence of particular bacterial strains. However, there is currently no clear overview of O-antigen diversity within the various Cronobacter strains or of its links with virulence. In this study, we tested a total of 82 strains, covering each of the Cronobacter species. The nucleotide variability of the O-antigen gene cluster was determined by restriction fragment length polymorphism (RFLP) analysis. As a result, the 82 strains were distributed into 11 previously published serotypes and 6 new serotypes, each defined by its characteristic restriction profile. These new serotypes were confirmed using genomic analysis of strains available in public databases: GenBank and PubMLST Cronobacter. Laboratory strains were then tested using the current serotype-specific PCR probes. The results show that the current PCR probes did not always correspond to genomic O-antigen gene cluster variation. In addition, we analyzed the LPS phenotype of the reference strains of all distinguishable serotypes. The identified serotypes were compared with data from the literature and the MLST database (www.pubmlst.org/cronobacter/). Based on the findings, we systematically classified a total of 24 serotypes for the Cronobacter genus. Moreover, we evaluated the clinical history of these strains and show that Cronobacter sakazakii O2, O1, and O4, C. turicensis O1, and C. malonaticus O2 serotypes are particularly predominant in clinical cases.
Statistical Methods for Detecting Differentially Abundant Features in Clinical Metagenomic Samples
Numerous studies are currently underway to characterize the microbial communities inhabiting our world. These studies aim to dramatically expand our understanding of the microbial biosphere and, more importantly, hope to reveal the secrets of the complex symbiotic relationship between us and our commensal bacterial microflora. An important prerequisite for such discoveries is the availability of computational tools that can rapidly and accurately compare large datasets generated from complex bacterial communities to identify the features that distinguish them.
Interpreting 16S metagenomic data without clustering to achieve sub-OTU resolution
The standard approach to analyzing 16S tag sequence data, which relies on
clustering reads by sequence similarity into Operational Taxonomic Units
(OTUs), underexploits the accuracy of modern sequencing technology. We present
a clustering-free approach to multi-sample Illumina datasets that can identify
independent bacterial subpopulations regardless of the similarity of their 16S
tag sequences. Using published data from a longitudinal time-series study of
human tongue microbiota, we are able to resolve within standard 97% similarity
OTUs up to 20 distinct subpopulations, all ecologically distinct but with 16S
tags differing by as little as 1 nucleotide (99.2% similarity). A comparative
analysis of oral communities of two cohabiting individuals reveals that most
such subpopulations are shared between the two communities at 100% sequence
identity, and that dynamical similarity between subpopulations in one host is
strongly predictive of dynamical similarity between the same subpopulations in
the other host. Our method can also be applied to samples collected in
cross-sectional studies and can be used with the 454 sequencing platform. We
discuss how the sub-OTU resolution of our approach can provide new insight into
factors shaping community assembly.
Comment: Updated to match the published version. 12 pages, 5 figures +
supplement. Significantly revised for clarity, references added, results not
change
jMOTU and Taxonerator: Turning DNA Barcode Sequences into Annotated Operational Taxonomic Units
BACKGROUND: DNA barcoding and other DNA sequence-based techniques for investigating and estimating biodiversity require explicit methods for associating individual sequences with taxa, as it is at the taxon level that biodiversity is assessed. For many projects, the bioinformatic analyses required pose problems for laboratories whose prime expertise is not in bioinformatics. User-friendly tools are required for both clustering sequences into molecular operational taxonomic units (MOTU) and for associating these MOTU with known organismal taxonomies. RESULTS: Here we present jMOTU, a Java program for the analysis of DNA barcode datasets that uses an explicit, determinate algorithm to define MOTU. We demonstrate its usefulness for both individual specimen-based Sanger sequencing surveys and bulk-environment metagenetic surveys using long-read next-generation sequencing data. jMOTU is driven through a graphical user interface, and can analyse tens of thousands of sequences in a short time on a desktop computer. A companion program, Taxonerator, that adds traditional taxonomic annotation to MOTU, is also presented. Clustering and taxonomic annotation data are stored in a relational database, and are thus amenable to subsequent data mining and web presentation. CONCLUSIONS: jMOTU efficiently and robustly identifies the molecular taxa present in survey datasets, and Taxonerator decorates the MOTU with putative identifications. jMOTU and Taxonerator are freely available from http://www.nematodes.org/
A statistical toolbox for metagenomics: assessing functional diversity in microbial communities
Abstract
Background
The 99% of bacteria in the environment that are recalcitrant to culturing have spurred the development of metagenomics, a culture-independent approach to sample and characterize microbial genomes. Massive datasets of metagenomic sequences have been accumulated, but analysis of these sequences has focused primarily on the descriptive comparison of the relative abundance of proteins that belong to specific functional categories. More robust statistical methods are needed to make inferences from metagenomic data. In this study, we developed and applied a suite of tools to describe and compare the richness, membership, and structure of microbial communities using peptide fragment sequences extracted from metagenomic sequence data.
Results
Application of these tools to acid mine drainage, soil, and whale fall metagenomic sequence collections revealed groups of peptide fragments with a relatively high abundance and no known function. When combined with analysis of 16S rRNA gene fragments from the same communities these tools enabled us to demonstrate that although there was no overlap in the types of 16S rRNA gene sequence observed, there was a core collection of operational protein families that was shared among the three environments.
Conclusion
The results of comparisons between the three habitats were surprising considering the relatively low overlap of membership and the distinctively different characteristics of the three habitats. These tools will facilitate the use of metagenomics to pursue statistically sound genome-based ecological analyses.
The Effects of Alignment Quality, Distance Calculation Method, Sequence Filtering, and Region on the Analysis of 16S rRNA Gene-Based Studies
Pyrosequencing of PCR-amplified fragments that target variable regions within the 16S rRNA gene has quickly become a powerful method for analyzing the membership and structure of microbial communities. This approach has revealed and introduced questions that were not fully appreciated by those carrying out traditional Sanger sequencing-based methods. These include the effects of alignment quality, the best method of calculating pairwise genetic distances for 16S rRNA genes, whether it is appropriate to filter variable regions, and how the choice of variable region relates to the genetic diversity observed in full-length sequences. I used a diverse collection of 13,501 high-quality full-length sequences to assess each of these questions. First, alignment quality had a significant impact on distance values and downstream analyses. Specifically, the greengenes alignment, which does a poor job of aligning variable regions, predicted higher genetic diversity, richness, and phylogenetic diversity than the SILVA and RDP-based alignments. Second, the effect of different gap treatments in determining pairwise genetic distances was strongly affected by the variation in sequence length for a region; however, the effect of different calculation methods was subtle when determining the sample's richness or phylogenetic diversity for a region. Third, applying a sequence mask to remove variable positions had a profound impact on genetic distances by muting the observed richness and phylogenetic diversity. Finally, the genetic distances calculated for each of the variable regions did a poor job of correlating with the full-length gene. Thus, while it is tempting to apply traditional cutoff levels derived for full-length sequences to these shorter sequences, it is not advisable. Analysis of β-diversity metrics showed that each of these factors can have a significant impact on the comparison of community membership and structure.
Taken together, these results urge caution in the design and interpretation of analyses using pyrosequencing data.
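To illustrate why gap treatment can matter for pairwise genetic distances, the following is a minimal Python sketch, not the method used in the study. The function name `pairwise_distance` and the three modes (`"ignore"`, `"each"`, `"one"`) are hypothetical labels chosen for this example; the abstract does not name the specific gap treatments that were compared.

```python
def pairwise_distance(a, b, gap_mode="ignore"):
    """Uncorrected distance between two aligned sequences of equal length.

    gap_mode (hypothetical labels, not taken from the abstract):
      "ignore": columns with a gap in either sequence are skipped
      "each":   every gapped column counts as one mismatch
      "one":    each run of consecutive gapped columns counts as one mismatch
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if gap_mode not in ("ignore", "each", "one"):
        raise ValueError("unknown gap_mode: " + gap_mode)

    mismatches = 0
    compared = 0
    in_gap_run = False
    for x, y in zip(a, b):
        x_gap, y_gap = x == "-", y == "-"
        if x_gap and y_gap:
            # Column gapped in both sequences: carries no information.
            continue
        if x_gap or y_gap:
            if gap_mode == "each" or (gap_mode == "one" and not in_gap_run):
                mismatches += 1
                compared += 1
            in_gap_run = True
            continue
        in_gap_run = False
        compared += 1
        if x != y:
            mismatches += 1
    return mismatches / compared if compared else 0.0


if __name__ == "__main__":
    seq_a = "ACGT--ACGT"
    seq_b = "ACGTTTACGA"
    for mode in ("ignore", "each", "one"):
        print(mode, pairwise_distance(seq_a, seq_b, mode))
```

In this toy pair, a single two-column gap changes the distance depending on the mode, which is consistent with the abstract's observation that the effect of gap treatment depends on how much sequence length varies within a region. The "one" mode here is simplified: adjacent gap runs in the two sequences are treated as a single event.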
The nasal cavity microbiota of healthy adults
Abstract
Background
The microbiota of the nares has been widely studied. However, relatively few studies have investigated the microbiota of the nasal cavity posterior to the nares. This distinct environment has the potential to contain a distinct microbiota and play an important role in health.
Results
We obtained 35,142 high-quality bacterial 16S rRNA-encoding gene sequence reads from the nasal cavity and oral cavity (the dorsum of the tongue and the buccal mucosa) of 12 healthy adult humans and deposited these data in the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) (Bioproject: PRJNA248297). In our initial analysis, we compared the bacterial communities of the nasal cavity and the oral cavity from ten of these subjects. The nasal cavity bacterial communities were dominated by Actinobacteria, Firmicutes, and Proteobacteria and were statistically distinct from those on the tongue and buccal mucosa. For example, the same Staphylococcaceae operational taxonomic unit (OTU) was present in all of the nasal cavity samples, comprising up to 55% of the community, but Staphylococcaceae was comparatively uncommon in the oral cavity.
Conclusions
There are clear differences between nasal cavity microbiota and oral cavity microbiota in healthy adults. This study expands our knowledge of the nasal cavity microbiota and the relationship between the microbiota of the nasal and oral cavities.
http://deepblue.lib.umich.edu/bitstream/2027.42/109547/1/40168_2014_Article_56.pd
Robust estimation of microbial diversity in theory and in practice
Quantifying diversity is of central importance for the study of structure,
function and evolution of microbial communities. The estimation of microbial
diversity has received renewed attention with the advent of large-scale
metagenomic studies. Here, we consider what the diversity observed in a sample
tells us about the diversity of the community being sampled. First, we argue
that one cannot reliably estimate the absolute and relative number of microbial
species present in a community without making unsupported assumptions about
species abundance distributions. The reason for this is that sample data do not
contain information about the number of rare species in the tail of species
abundance distributions. We illustrate the difficulty in comparing species
richness estimates by applying Chao's estimator of species richness to a set of
in silico communities: they are ranked incorrectly in the presence of large
numbers of rare species. Next, we extend our analysis to a general family of
diversity metrics ("Hill diversities"), and construct lower and upper estimates
of diversity values consistent with the sample data. The theory generalizes
Chao's estimator, which we retrieve as the lower estimate of species richness.
We show that Shannon and Simpson diversity can be robustly estimated for the in
silico communities. We analyze nine metagenomic data sets from a wide range of
environments, and show that our findings are relevant for empirically-sampled
communities. Hence, we recommend the use of Shannon and Simpson diversity
rather than species richness in efforts to quantify and compare microbial
diversity.
Comment: To be published in The ISME Journal. Main text: 16 pages, 5 figures.
Supplement: 16 pages, 4 figure
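The quantities named in this abstract can be made concrete with a short Python sketch. It computes plug-in Hill diversities of order q from a vector of counts, along with the classic Chao1 richness estimate. It does not reproduce the paper's lower and upper estimates; the function names `hill_number` and `chao1` and the example counts are illustrative assumptions.

```python
import math


def hill_number(counts, q):
    """Plug-in Hill diversity of order q from observed counts.

    q = 0 gives observed richness, q = 1 gives exp(Shannon entropy),
    q = 2 gives the inverse Simpson index.
    """
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit of the general formula as q -> 1.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


def chao1(counts):
    """Classic Chao1 estimate: S_obs + f1^2 / (2 * f2).

    f1 and f2 are the numbers of species observed exactly once and twice.
    When f2 is zero, the bias-corrected form f1 * (f1 - 1) / 2 is used
    to avoid division by zero.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)


if __name__ == "__main__":
    # Hypothetical sample: a few dominant species and a tail of rare ones.
    sample = [50, 30, 10, 5, 2, 2, 1, 1, 1]
    print("observed richness (q=0):", hill_number(sample, 0))
    print("exp(Shannon)      (q=1):", hill_number(sample, 1))
    print("inverse Simpson   (q=2):", hill_number(sample, 2))
    print("Chao1 richness estimate:", chao1(sample))
```

The higher orders (q = 1, 2) weight abundant species more heavily, so they depend less on the unobserved rare tail than richness does. That is the intuition behind the abstract's recommendation of Shannon and Simpson diversity over species richness.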
The unifrac significance test is sensitive to tree topology
Long et al. (BMC Bioinformatics 2014, 15(1):278) describe a “discrepancy” in using UniFrac to assess statistical significance of community differences. Specifically, they find that weighted UniFrac results differ between input trees where (a) replicate sequences each have their own tip, or (b) all replicates are assigned to one tip with an associated count. We argue that these are two distinct cases that differ in the probability distribution on which the statistical test is based, because of the differences in tree topology. Further study is needed to understand which randomization procedure best detects different aspects of community dissimilarities.